travel plan
VIR-Bench: Evaluating Geospatial and Temporal Understanding of MLLMs via Travel Video Itinerary Reconstruction
Wang, Hao, Murata, Eiki, Zhang, Lingfang, Sato, Ayako, Fukuda, So, Yin, Ziqi, Hu, Wentao, Nakao, Keisuke, Nakamura, Yusuke, Zwirner, Sebastian, Chen, Yi-Chia, Otomo, Hiroyuki, Ouchi, Hiroki, Kawahara, Daisuke
Recent advances in multimodal large language models (MLLMs) have significantly enhanced video understanding capabilities, opening new possibilities for practical applications. Yet current video benchmarks focus largely on indoor scenes or short-range outdoor activities, leaving the challenges associated with long-distance travel largely unexplored. Mastering extended geospatial-temporal trajectories is critical for next-generation MLLMs, underpinning real-world tasks such as embodied-AI planning and navigation. To bridge this gap, we present VIR-Bench, a novel benchmark consisting of 200 travel videos that frames itinerary reconstruction as a challenging task designed to evaluate and push forward MLLMs' geospatial-temporal intelligence. Experimental results reveal that state-of-the-art MLLMs, including proprietary ones, struggle to achieve high scores, underscoring the difficulty of handling videos that span extended spatial and temporal scales. Moreover, we conduct an in-depth case study in which we develop a prototype travel-planning agent that leverages the insights gained from VIR-Bench. The agent's markedly improved itinerary recommendations verify that our evaluation protocol not only benchmarks models effectively but also translates into concrete performance gains in user-facing applications.
- Consumer Products & Services > Travel (1.00)
- Transportation > Infrastructure & Services (0.93)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
IMAGINE: Integrating Multi-Agent System into One Model for Complex Reasoning and Planning
Zhang, Xikai, Wang, Bo, Xiao, Likang, Li, Yongzhi, Chen, Quan, Wu, Wenju, Liu, Liu
Although large language models (LLMs) have made significant strides across various tasks, they still face major challenges in complex reasoning and planning. For example, even with carefully designed prompts and prior information explicitly provided, GPT-4o achieves only a 7% Final Pass Rate on the TravelPlanner dataset in the sole-planning mode. Similarly, even in the thinking mode, Qwen3-8B-Instruct and DeepSeek-R1-671B achieve Final Pass Rates of only 5.9% and 40%, respectively. Although well-organized Multi-Agent Systems (MAS) can offer improved collective reasoning, they often suffer from high reasoning costs due to multi-round internal interactions, long per-response latency, and difficulties in end-to-end training. To address these challenges, we propose a general and scalable framework called IMAGINE, short for Integrating Multi-Agent System into One Model. This framework not only integrates the reasoning and planning capabilities of MAS into a single, compact model, but also significantly surpasses the capabilities of the MAS through simple end-to-end training. Through this pipeline, a single small-scale model is not only able to acquire the structured reasoning and planning capabilities of a well-organized MAS but can also significantly outperform it. Experimental results demonstrate that, when using Qwen3-8B-Instruct as the base model and training it with our method, the model achieves an 82.7% Final Pass Rate on the TravelPlanner benchmark, far exceeding the 40% of DeepSeek-R1-671B, while maintaining a much smaller model size.
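The Final Pass Rate figures quoted above are easier to compare with the metric written out. The following is a minimal sketch, assuming a plan counts as passing only when it is delivered and satisfies every commonsense and hard constraint; the `PlanResult` structure and its field names are hypothetical and are not taken from the TravelPlanner codebase.

```python
from dataclasses import dataclass


@dataclass
class PlanResult:
    """Outcome of checking one generated plan (hypothetical structure)."""
    delivered: bool        # a plan was produced at all
    commonsense_ok: bool   # every commonsense constraint passed
    hard_ok: bool          # every hard (user-specified) constraint passed


def final_pass_rate(results: list[PlanResult]) -> float:
    """Fraction of queries whose plan passes every check (assumed definition)."""
    if not results:
        return 0.0
    passed = sum(r.delivered and r.commonsense_ok and r.hard_ok for r in results)
    return passed / len(results)


# Four illustrative queries; only the first plan passes everything.
results = [
    PlanResult(True, True, True),
    PlanResult(True, True, False),
    PlanResult(True, False, True),
    PlanResult(False, False, False),
]
print(f"{final_pass_rate(results):.1%}")  # 25.0%
```

Under this all-or-nothing reading, a plan that violates a single constraint contributes nothing to the score, which may help explain why the reported rates are low even for strong models.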
TripCraft: A Benchmark for Spatio-Temporally Fine Grained Travel Planning
Chaudhuri, Soumyabrata, Purkar, Pranav, Raghav, Ritwik, Mallick, Shubhojit, Gupta, Manish, Jana, Abhik, Ghosh, Shreya
Recent advancements in probing Large Language Models (LLMs) have explored their latent potential as personalized travel planning agents, yet existing benchmarks remain limited in real-world applicability. Existing datasets, such as TravelPlanner and TravelPlanner+, suffer from semi-synthetic data reliance, spatial inconsistencies, and a lack of key travel constraints, making them inadequate for practical itinerary generation. To address these gaps, we introduce TripCraft, a spatiotemporally coherent travel planning dataset that integrates real-world constraints, including public transit schedules, event availability, diverse attraction categories, and user personas for enhanced personalization. To evaluate LLM-generated plans beyond existing binary validation methods, we propose five continuous evaluation metrics, namely Temporal Meal Score, Temporal Attraction Score, Spatial Score, Ordering Score, and Persona Score, which assess itinerary quality across multiple dimensions. Our parameter-informed setting significantly enhances meal scheduling, improving the Temporal Meal Score from 61% to 80% in a 7-day scenario. TripCraft establishes a new benchmark for LLM-driven personalized travel planning, offering a more realistic, constraint-aware framework for itinerary generation. Dataset and Codebase will be made publicly available upon acceptance.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > India > Nagaland (0.04)
- North America > United States > Tennessee (0.04)
Ask-before-Plan: Proactive Language Agents for Real-World Planning
Zhang, Xuan, Deng, Yang, Ren, Zifeng, Ng, See-Kiong, Chua, Tat-Seng
The evolution of large language models (LLMs) has enhanced the planning capabilities of language agents in diverse real-world scenarios. Despite these advancements, the potential of LLM-powered agents to comprehend ambiguous user instructions for reasoning and decision-making is still under exploration. In this work, we introduce a new task, Proactive Agent Planning, which requires language agents to predict clarification needs based on user-agent conversation and agent-environment interaction, invoke external tools to collect valid information, and generate a plan to fulfill the user's demands. To study this practical problem, we establish a new benchmark dataset, Ask-before-Plan. To tackle the deficiency of LLMs in proactive planning, we propose a novel multi-agent framework, Clarification-Execution-Planning (CEP), which consists of three agents specialized in clarification, execution, and planning. We introduce the trajectory tuning scheme for the clarification agent and static execution agent, as well as the memory recollection mechanism for the dynamic execution agent. Extensive evaluations and comprehensive analyses conducted on the Ask-before-Plan dataset validate the effectiveness of our proposed framework.
- North America > United States > California > San Diego County > San Diego (0.05)
- Europe > Spain > Galicia > Madrid (0.04)
- Asia > India > Nagaland (0.04)
TRIP-PAL: Travel Planning with Guarantees by Combining Large Language Models and Automated Planners
de la Rosa, Tomas, Gopalakrishnan, Sriram, Pozanco, Alberto, Zeng, Zhen, Borrajo, Daniel
Travel planning is a complex task that involves generating a sequence of actions related to visiting places subject to constraints and maximizing some user satisfaction criteria. Traditional approaches rely on formulating the problem in a given formal language, extracting relevant travel information from web sources, and using an adequate problem solver to generate a valid solution. As an alternative, recent Large Language Model (LLM) based approaches directly output plans from user requests using language. Although LLMs possess extensive travel domain knowledge and provide high-level information like points of interest and potential routes, current state-of-the-art models often generate plans that lack coherence, fail to satisfy constraints fully, and do not guarantee the generation of high-quality solutions. We propose TRIP-PAL, a hybrid method that combines the strengths of LLMs and automated planners, where (i) LLMs get and translate travel information and user information into data structures that can be fed into planners; and (ii) automated planners generate travel plans that guarantee constraint satisfaction and optimize for users' utility. Our experiments across various travel scenarios show that TRIP-PAL outperforms an LLM when generating travel plans.
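The two-stage split described in the abstract can be illustrated with a toy example. This is a minimal sketch under loud assumptions, not the TRIP-PAL implementation: the LLM step (i) is replaced by a hand-written dictionary with invented values, and a brute-force search stands in for the automated planner of step (ii). All names (`pois`, `travel_min`, `budget_min`, `best_plan`) are hypothetical.

```python
from itertools import permutations

# Stand-in for step (i): structured data an LLM might extract from a request.
# All values are invented for illustration.
pois = {
    "museum": {"utility": 8, "visit_min": 120},
    "park": {"utility": 5, "visit_min": 60},
    "market": {"utility": 6, "visit_min": 90},
}
travel_min = 30   # assumed constant travel time between any two stops
budget_min = 300  # hard constraint: total minutes available


def plan_duration(route):
    """Total visit time plus travel between consecutive stops."""
    visits = sum(pois[p]["visit_min"] for p in route)
    return visits + travel_min * max(len(route) - 1, 0)


def best_plan():
    """Stand-in for step (ii): exhaustive search over feasible routes only."""
    best, best_utility = (), 0
    for k in range(1, len(pois) + 1):
        for route in permutations(pois, k):
            if plan_duration(route) > budget_min:
                continue  # violates the time budget, so it is never returned
            utility = sum(pois[p]["utility"] for p in route)
            if utility > best_utility:
                best, best_utility = route, utility
    return best, best_utility


route, utility = best_plan()
print(route, utility, plan_duration(route))
```

Because infeasible routes are discarded before scoring, any plan this search returns satisfies the time budget by construction, which is the kind of guarantee the abstract attributes to the planner side of the hybrid. A real planner would replace the exhaustive loop, which does not scale beyond a handful of stops.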
- South America > Uruguay > Colonia > Colonia del Sacramento (0.04)
- North America > United States > New York (0.04)
- North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
A Human-Like Reasoning Framework for Multi-Phases Planning Task with Large Language Models
Recent studies have highlighted the proficiency of LLM agents in some simple tasks like writing and coding through various reasoning strategies. However, LLM agents still struggle with tasks that require comprehensive planning, a process that challenges current models and remains a critical research issue. In this study, we concentrate on travel planning, a Multi-Phases planning problem that involves multiple interconnected stages, such as outlining, information gathering, and planning, and is often characterized by the need to manage various constraints and uncertainties. Existing reasoning approaches have struggled to effectively address this complex task. Our research aims to address this challenge by developing a human-like planning framework for LLM agents, i.e., guiding the LLM agent to simulate various steps that humans take when solving Multi-Phases problems. Specifically, we implement several strategies to enable LLM agents to generate a coherent outline for each travel query, mirroring human planning patterns. Additionally, we integrate Strategy Block and Knowledge Block into our framework: Strategy Block facilitates information collection, while Knowledge Block provides essential information for detailed planning. Through our extensive experiments, we demonstrate that our framework significantly improves the planning capabilities of LLM agents, enabling them to tackle the travel planning task with improved efficiency and effectiveness. Our experimental results showcase the exceptional performance of the proposed framework; when combined with GPT-4-Turbo, it attains 10× the performance gains in comparison to the baseline framework deployed on GPT-4-Turbo.
- North America > United States > California > San Francisco County > San Francisco (0.06)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.06)
- North America > United States > New York (0.05)
A Summarized History-based Dialogue System for Amnesia-Free Prompt Updates
Hong, Hyejin, Kawano, Hibiki, Maekawa, Takuto, Yoshimaru, Naoki, Iio, Takamasa, Hatano, Kenji
In today's society, information overload presents challenges in providing optimal recommendations. Consequently, the importance of dialogue systems that can discern and provide the necessary information through dialogue is increasingly recognized. However, there are concerns that existing dialogue systems rely on pre-trained models and struggle to cope with real-time or insufficient information. To address these concerns, models that allow the addition of missing information to dialogue robots are being proposed. Yet, maintaining the integrity of previous conversation history while integrating new data remains a formidable challenge. This paper presents a novel system for dialogue robots designed to remember user-specific characteristics by retaining past conversation history even as new information is added.
Team Flow at DRC2023: Building Common Ground and Text-based Turn-taking in a Travel Agent Spoken Dialogue System
Hirai, Ryu, Iizuka, Shinya, Iseno, Haruhisa, Guo, Ao, Jiang, Jingjing, Ohashi, Atsumoto, Higashinaka, Ryuichiro
At the Dialogue Robot Competition 2023 (DRC2023), which was held to improve the capability of dialogue robots, our team developed a system that could build common ground and take more natural turns based on user utterance texts. Our system generated queries for sightseeing spot searches using the common ground and engaged in dialogue while waiting for user comprehension.
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.06)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
- Asia > Japan > Honshū > Tōhoku (0.05)
Tripedia Revolutionizes the Travel Ecosystem With an Artificial Intelligence Travel Designer
Customers provide Tripedia with their information and travel preferences. The Artificial Intelligence Travel Designer (AITD) uses the data it has on previous reviews, accommodation, and flight availability to recommend the best travel plan for them. This type of learning saves time and energy compared to traditional machine-learning algorithms. Using blockchain, the self-evolving system learns consumers' travel preferences as they emerge.
How AI and Tech Personal Assistants Will Make Your Life Easier
When you picture a personal assistant, you probably envision a celebrity -- or at least someone very wealthy -- walking around with a "shadow" who takes her messages, makes her travel plans, helps her pick out the perfect outfit, and eliminates other daily tasks from her lengthy to-do list. If you've ever wished you could have a personal assistant of your own, then you're in luck. Personal assistants are quickly becoming accessible to the masses -- thanks to AI and other emerging tech. In fact, technology really has been an equalizer when it comes to providing everyday people access to personal concierge services. Besides Siri or Google on your smartphone, Amazon's Alexa and Google Home may be some of the first digital "personal" assistants that come to mind.
- North America > United States > New York (0.05)
- North America > United States > California > San Francisco County > San Francisco (0.05)